COMBINE 2014 - Abstracts

Vector NTI Express Designer as a demonstration of SBOL interoperability with COMBINE standards

Kevin Clancy

Vector NTI Express Designer is a desktop bioinformatics software package. It provides an array of tools designed to support collection and analysis of public and private data; design of experimental constructs and assays through cloning, primer design, etc.; tracking, procurement and management of clone construction; and testing and confirmation of cloned sequences. Synthetic biology is a new approach to engineering biology, relying upon the use of functionally defined sequences that can be used and reused in a predictable and standardized fashion. The software integrates the traditional tools and workflows of sequence analysis with synthetic biology and supports synthetic biology workflows, particularly around the development and use of Part, Device and Circuit collections, scanners for elements like Ribosome Binding Sites, RNA secondary structure and terminators, and codon optimization tools. In developing this software we used SBOL as an exchange medium for Part and Device information and SBOL Visual to develop the graphical canvas for Part and Device development. We use Sequence Ontology as a means to identify and manage the functional roles of Parts and Devices. We use SBGN symbols to identify the functional interactions of Devices organized into Circuits or Devices with pools of molecules within the cell. The utility of the software is largely through the combination of multiple standards into an integrated design tool for synthetic biology. As the SBOL standard moves from a purely DNA-based synthetic part system to other biological sequence moieties and use of models to represent their functional performance, we discuss how the COMBINE standards may be used to support the evolving model.

The transcription factor titration effect: A statistical mechanical description of promoter regulation

Robert Sidney Cox III

Transcription factors (TFs) with regulatory action at multiple promoter targets are the rule rather than the exception, with examples ranging from the cAMP receptor protein (CRP) in E. coli that regulates hundreds of different genes simultaneously to situations involving multiple copies of the same gene, such as plasmids, retrotransposons, or highly replicated viral DNA. When the number of TFs greatly exceeds the number of binding sites, TF binding to each promoter can be regarded as independent. However, when the number of TF molecules is comparable to the number of binding sites, TF titration will result in correlation (“promoter entanglement”) between transcription of different genes. We develop a statistical mechanical model which takes the TF titration effect into account and use it to predict both the level of gene expression for a general set of promoters and the resulting correlation in transcription rates of different genes. Our results show that the TF titration effect could be important for understanding gene expression in many regulatory settings.
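
The titration effect can be illustrated with a much simpler picture than the statistical mechanical model of the abstract. The sketch below is a deterministic mass-action approximation, not the authors' model: it solves the conservation relation total = free + bound for the free TF count and reports the occupancy of one target promoter with and without competing sites. All copy numbers and dissociation constants are hypothetical values chosen for illustration.

```python
def free_tf(total_tf, sites, iterations=200):
    """Free TF count satisfying: total = free + sum(copies * free / (K + free)).

    sites is a list of (copies, K) pairs, where K is a dissociation
    constant expressed in the same units as the TF counts.
    """
    def bound(free):
        return sum(copies * free / (k + free) for copies, k in sites)

    lo, hi = 0.0, float(total_tf)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        # free + bound(free) increases with free, so bisection applies.
        if mid + bound(mid) > total_tf:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def occupancy(total_tf, sites, k_target):
    """Equilibrium occupancy of one site with dissociation constant k_target."""
    free = free_tf(total_tf, sites)
    return free / (k_target + free)


# Hypothetical scenario: one target promoter (K = 10), optionally
# sharing the TF pool with 50 competing sites of the same affinity.
target_only = [(1, 10.0)]
with_competitors = [(1, 10.0), (50, 10.0)]

for total in (20, 1000):
    alone = occupancy(total, target_only, 10.0)
    shared = occupancy(total, with_competitors, 10.0)
    print(f"TFs={total}: alone={alone:.3f}, with competitors={shared:.3f}")
```

In this toy setting the competing sites barely change the target occupancy when TFs are in large excess, but reduce it substantially when the TF count is comparable to the number of sites, which is the regime the abstract describes.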

NeuroML: Model Exchange in Computational Neuroscience

Sharon Crook

The Neural Open Markup Language project, NeuroML, is an international, collaborative initiative to develop a language for describing and sharing complex, multiscale neuron and neuronal network models. The project focuses on the key objects that need to be exchanged among software applications used by computational neuroscientists. Examples of these objects include descriptions of neuronal morphology, the dynamics of ion channels and synaptic mechanisms, and the connectivity patterns of networks of model neurons. This modular approach brings additional benefits: not only can entire models be published and exchanged in this format, but each individual object or component, such as a specific calcium channel or excitatory synapse, can be shared and re-implemented in a different model. We will provide an overview of the latest developments in NeuroML including NeuroML version 2.0, the NeuroML Model Database, and the use of the NeuroML language for model exchange at Open Source Brain.

Numerical Markup Language and Data Converter

Joseph Olufemi Dada

Data are an essential component of systems biology research. They are consumed and generated by software tools and experiments conducted in the laboratories, as well as used for analysis and parameterization of biological models. Data representation in different formats makes it difficult to automate data exchange between experimentalists and modelers on the one hand, and between different software tools on the other. The aim of Numerical Markup Language (NuML) is to provide a standardized and universal way of exchanging and archiving numerical data. A library (libNUML) for reading, writing, manipulating and validating data encoded in NuML has been developed in C/C++ with bindings for Java and Python. However, a data converter may still be needed to help those who are used to keeping their data in other formats, such as Excel, tab-delimited and comma-separated files. NuML will be briefly presented and the need for a data converter, if any, will be discussed.

Modeling WUSCHEL domain expression in the shoot apical meristem of Arabidopsis thaliana

Michelle DeVost and Brianna Amador

We present a model of WUSCHEL (WUS) domain expression in the shoot apical meristem (SAM) of Arabidopsis thaliana. WUS is a transcription factor that is known to be involved in multiple functions in the meristem, including cell division, differentiation, and organogenesis. A self-regulating system involving CLAVATA3 (CLV3), the cytokinins (CK), and several other signaling molecules maintains the organizing center, or stem cell niche, in which WUS is expressed. While WUS is known to be an essential protein in the process of differentiation, the underlying mechanisms that preserve the position and size of its expression domain throughout plant growth are not yet fully understood. The signaling network represented in this new model was simulated deterministically using Cellerator/Cellzilla. Simulations were performed on a two-dimensional parabolic lattice of 77 cells, involving 35 reactions per compartment and an additional 2130 diffusion reactions for a total of 4825 reactions. The output exhibited effective maintenance of the stem cell niche. An SBGN process description diagram of this model was generated using CellDesigner. An SBML model of the basic (single cell) network is also available.
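
The reaction count quoted above can be checked with a few lines of arithmetic. The variable names below are illustrative; the numbers are those given in the abstract.

```python
cells = 77                  # cells in the two-dimensional lattice
reactions_per_cell = 35     # reactions per compartment
diffusion_reactions = 2130  # additional diffusion reactions

intracellular = cells * reactions_per_cell
total = intracellular + diffusion_reactions
print(intracellular, total)
```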

Using MultiCellDS and digital cell lines to initialize large-scale 3-D agent-based cancer simulations (up to 0.5M cells)

Samuel Friedman

Understanding and predicting cancer progression requires detailed interacting models of tumor and stromal cells, all calibrated to experimental data. Work to date has been limited by a lack of standardization of data representations of multicellular systems, though this is now being addressed through MultiCellDS (MultiCellular Data Standard) and digital cell lines, which are standardized representations of microenvironment-dependent cell phenotypes. Computational cancer modelers require biologically and mathematically consistent initialization routines to seed simulations with cells defined in digital cell lines. In this talk, we will briefly introduce a 3-D agent-based model designed for use in integrative computational biology. We introduce a “snapshot generator” that can take a digital cancer cell line and produce for the agent-based model an initial cell arrangement and a phenotypic state based upon analyses of the digital cell line data elements. We demonstrate 2-D monolayer and 3-D hanging drop simulations up to 500k MCF7 cells, a common breast cancer cell line. We additionally demonstrate the production of digital snapshots, standardized simulation output that will facilitate computational model comparison with a common core of analytical tools. With an early version of these tools, we assess the match between simulations and in vitro experiments. In the future, this work will be used to create and simulate combinations of tumor and stromal cells from appropriate digital cell lines in realistic tissue environments in order to understand, predict, and eventually control cancer progression in individual patients.

From grassroots to common standards

Martin Golebiewski

The whole data lifecycle in systems biology requires a high degree of standardization of formats for data, models and metadata: acquisition and description of the data, processing and analysis, their efficient and secure exchange, data integration and incorporation into computational models, as well as the setup, handling and simulation of the models, all have to follow dedicated standards. To this end many grassroots community standards for exchange formats and metadata description have been defined by different scientific communities in the field. However, it is often confusing and cumbersome for the potential users (e.g. experimentalists or modelers) to find the appropriate standards for their tasks. In my talk I will outline options for classifying and transforming existing community standards into official norms and how that might help in broadening the user acceptance.

Data upload into SABIO-RK via SBML

Martin Golebiewski

SABIO-RK (http://sabio.h-its.org/) is a database for biochemical reactions and their kinetic properties. It includes kinetic parameters and kinetic rate laws in relation to their biological sources and experimental conditions. One focus of SABIO-RK is to support the computational modelling of biochemical reaction networks. The data are mainly extracted from literature, stored in a structured format, and annotated to external database identifiers, controlled vocabularies and ontologies. The database can be accessed either manually via a web-based search interface or via RESTful web services that allow direct automated access by other tools. Both interfaces support the export of the data together with its annotations in SBML (Systems Biology Markup Language) complying with the MIRIAM standard. To allow direct submission to SABIO-RK of experimental results from laboratories, as well as calculated kinetic data from model simulations, we have now implemented a web service for uploading data in SBML format into our web-based input interface. Depending on the quality of the SBML file and the availability of annotations to ontologies and other databases, the data are already mapped during the parsing and insertion process.

Reactome Knowledgebase - Linking biological pathways, networks and disease.

Robin Haw

Reactome (http://www.reactome.org) is an open access, open source, curated and peer-reviewed pathway knowledgebase. As one of the gold-standard pathway databases, it has been widely used in many pathway and network based data analysis projects. The data model used behind the Reactome database has undergone some recent changes to support the annotation of disease pathways and processes. We have released a new Reactome Pathway Browser with an integrated suite of tools for pathway analysis. These updates have migrated into our data exchange files, including BioPAX, PSICQUIC, SBGN, and SBML formats. In addition, several APIs are available to access the Reactome contents programmatically. Recently, a RESTful API was developed to support the backend services for the new Reactome web based applications.

Identifiers.org: cross-referencing and integration tool for heterogeneous datasets

Nick Juty

The provision of a service to generate robust and perennial identifiers for data records in the Life Sciences must address two major issues: a) A way to associate a record with a dataset: data providers identify records using identifiers which may only be unique within their dataset (e.g. '9606' identifies "Homo sapiens" in the NCBI Taxonomy, but also identifies "Catha edulis" in the plant-based GRIN Taxonomy). b) Since data records are often distributed by multiple resources, or may be mirrored, different URLs may be used to reference individual records. Moreover, these URLs often change over time. In order to address the issues of proliferation of ambiguous identifiers and non-perennial access URLs, an effort was launched in 2006 to provide a system through which appropriate URIs (Uniform Resource Identifiers) could be generated. The Identifiers.org resolving system is designed to support the use of HTTP URIs directly for both annotation and cross-referencing purposes. These URIs are directly incorporable in datasets, increase usability by software tools in their processing and display, and are resolvable by the end user, since they can actually be used as they stand in web interfaces. Moreover, these URIs are free, and provide unique, perennial and location-independent identifiers. The resolving system is reliant upon an underlying Registry which stores information on the various datasets (termed collections), assigned namespace (unique short string identifying the collections), and lists the resources (physical locations from where data records can be retrieved). This infrastructure is already used very successfully by, for example, the computational modelling community, which requires the ability to perennially record cross-references and links to external data records, despite the ever changing nature of the location of information on the web. 
It is also a core element of numerous Semantic Web data management and provision infrastructures, such as the OpenPHACTS project and the EBI RDF Platform. Here we describe the extension of the information recorded in the underlying Registry, and the services provided, which should greatly help in the integration of heterogeneous datasets. These incremental improvements to the infrastructure facilitate data integration independently of the URI scheme that may have been used originally, and allow the system to be more universally suitable for a wide range of use cases.
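
The way a collection namespace disambiguates a locally unique identifier can be sketched as follows. This is a minimal illustration, not the Identifiers.org implementation: the two-entry dictionary is a hypothetical stand-in for the Registry, and the namespace strings ("taxonomy", "grin.taxonomy") are assumed here to match the registered collections.

```python
BASE = "http://identifiers.org"

# Miniature, hypothetical stand-in for the Registry:
# namespace -> collection name.
COLLECTIONS = {
    "taxonomy": "NCBI Taxonomy",
    "grin.taxonomy": "GRIN Plant Taxonomy",
}


def to_uri(namespace, record_id):
    """Build a location-independent URI from a namespace and a local identifier."""
    if namespace not in COLLECTIONS:
        raise KeyError(f"unknown collection namespace: {namespace}")
    return f"{BASE}/{namespace}/{record_id}"


# The same local identifier yields two distinct, unambiguous URIs.
print(to_uri("taxonomy", "9606"))
print(to_uri("grin.taxonomy", "9606"))
```

Because the namespace is part of the URI, the record '9606' in one collection can no longer be confused with '9606' in another, and the URI stays the same even if the resources serving the record change.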

Jummp: a generic and modular model management platform for life sciences

Nick Juty

Computational models have been around for a few decades and are now a widely used tool for the study of complex systems in life sciences. For example, in the field of Systems Biology, community-driven standards backed up by tools have led to a sharp surge in the number of models produced. Today, a large number of new models are created using parts from already existing ones or even extend previously published ones. In addition, due to the fact that they provide a convenient way to record the state of the art knowledge about given processes, models are a very good medium for scientific exchange. This is true for research, education and training purposes. This demonstrates that the public provision of models is necessary but not enough. In order to facilitate model reuse, the community needs trustworthy and reliable models encoded in standard formats and, to this end, resources such as BioModels Database [Li2010] or the Physiome Model Repository [Yu2011] have been created. However, more and more models are now developed collaboratively. For example, the Human metabolism global reconstruction [Thiele2013] is the fruit of a major collaboration and used innovative ways to tackle this huge task, such as the organisation of annotation jamborees. This shows the need for model management infrastructures which provide support for those new ways to build models, involving several people disparately located. Some have already started to tackle this issue, such as the Physiome Model Repository [Miller2011] or the Open Source Brain [Gleeson2012], which rely on existing version control systems to support worldwide collaboration. Additionally, the size, complexity and number of models generated are ever increasing. This puts a lot of burden on existing infrastructures and makes it difficult for search engines to supply meaningful results.
Moreover, some models are now being produced semi-automatically, for example from pathways and reactions resources, Path2Models [Büchel2013] being one such example. These challenges are faced not just by the Systems Biology community. For instance, in the field of Pharmacometrics, where the predominance of proprietary tools fosters the fragmentation of the landscape and thwarts model reusability, the Drug Disease Model Resources (DDMoRe) project [Harnisch2013] has been established with a mission to address these issues. Aware of all these needs and taking into consideration the limitations of the existing software infrastructures, we decided to collaboratively develop the next generation model management infrastructure: Jummp (JUst a Model Management Platform). It has been designed as a complete web based solution for model development, curation and provision. The platform features a modular architecture consisting of a small core, responsible for the interaction with the back-end storage infrastructure, on top of which various components can be plugged in at runtime based on configuration settings. The application employs a dual storage engine strategy in the shape of a database and a version control system for capturing all the necessary information. In order to facilitate the installation of the platform in a variety of environments, multiple relational database types and version control systems are supported, in a way transparent for the core of the application. Jummp is able to store not just models, but any file that aids their comprehensibility or expressiveness, be it a data file, a graphical plot or a descriptive document. General tasks pertaining to model management, including the storage, versioning, sharing, searching, retrieval or archival of models are abstracted from the model format, allowing the application to remain as generic as possible and to avoid unnecessary coupling.
The content of the platform can be explored through a user interface, as well as programmatically, via a RESTful API supporting XML and JSON. As models are updated, users are still able to access previous versions, providing insight into the evolution of their development. Multiple model formats are supported by means of different plugins that adhere to the same contract. These plugins are used to detect the format of the submission and assist with the validation of the model, but are also in charge of extracting information from the model or influencing the generation of a comprehensive specialised display. The platform comes with support for a range of model formats including SBML, PharmML and the COMBINE archive. Additional standards can be supported very easily if required. Secure access to models is one of the key traits of Jummp. A user has a fine-grained control mechanism at their disposal for choosing which person or group of people can access a model submission and the extent to which each collaborator can modify the submission’s content or visibility. The access control matrix underpinning the authorisation framework of the platform allows certain features to be further restricted to more specialised user roles such as administrators or curators. A glimpse into the platform’s capabilities is provided by the DDMoRe Model Repository (http://ddmore.eu), the first public repository running Jummp. In the near future, the platform will also be used to power BioModels Database. Those two resources showcase the versatility of Jummp’s infrastructure.
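
The idea of format plugins that adhere to a common contract can be sketched in a few lines. This is an illustration only and does not reflect Jummp's actual code or API: the class names, the `detects` method and the `detect_format` helper are hypothetical, and the sketch assumes that SBML and PharmML documents can be recognised by the root elements `sbml` and `PharmML` respectively.

```python
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod


class FormatPlugin(ABC):
    """Contract every format plugin follows (hypothetical, for illustration)."""

    name = "unknown"

    @abstractmethod
    def detects(self, content: bytes) -> bool:
        """Return True if the submission looks like this plugin's format."""


class XmlRootPlugin(FormatPlugin):
    """Recognises an XML format by the local name of its root element."""

    def __init__(self, name, root_tag):
        self.name = name
        self.root_tag = root_tag

    def detects(self, content):
        try:
            root = ET.fromstring(content)
        except ET.ParseError:
            return False
        # Drop a "{namespace}" prefix, if any, before comparing.
        return root.tag.rsplit("}", 1)[-1] == self.root_tag


def detect_format(content, plugins):
    """Ask each plugin in turn; the core never inspects the format itself."""
    for plugin in plugins:
        if plugin.detects(content):
            return plugin.name
    return "unknown"


PLUGINS = [XmlRootPlugin("SBML", "sbml"), XmlRootPlugin("PharmML", "PharmML")]

sbml_doc = (
    b'<sbml xmlns="http://www.sbml.org/sbml/level2/version4" '
    b'level="2" version="4"><model id="m"/></sbml>'
)
print(detect_format(sbml_doc, PLUGINS))
```

Keeping the core ignorant of any particular format means that supporting a further standard only requires adding another plugin to the list.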

BioModels Database - A public repository of mathematical models of biological processes

Nick Juty

BioModels Database is a well-established public resource for mathematical models of biological processes. It hosts models of varying dimensions, encompassing simple biochemical reaction systems, larger complex dynamic models, metabolic network models, and Flux Balance Analysis (FBA) models. The resource provides access to over 1000 literature based models and over 140,000 models automatically generated from pathway resources. The model components, structure and behaviour of a large proportion of literature based models are manually validated to ensure correspondence with the original publication, with respect to the biological process that is represented, and the simulation results that are generated. Individual model components are cross-linked to external database resources and ontologies, allowing the unambiguous description of model content. Models are stored natively in Systems Biology Markup Language (SBML), but additionally are served in several other common formats, such as BioPAX, XPP, Octave (m-file), SciLab and PDF.

SBOL Stack: The One-stop-shop to Storing and Publishing SBOL Data

Curtis Madsen

Recently, synthetic biologists have developed the Synthetic Biology Open Language (SBOL), a data exchange standard for descriptions of genetic parts, devices, modules, and systems. The goals of this standard are to allow researchers to exchange designs of biological parts and systems, to send and receive genetic designs to and from biofabrication centers, to facilitate storage of genetic designs in repositories, and to embed genetic designs in publications. In order to achieve these goals, the development of an infrastructure to store, retrieve, and exchange SBOL data is necessary.

To address this problem, we have developed the SBOL Stack, a Sesame Resource Description Framework (RDF) database specifically designed for storing and publishing SBOL data. The SBOL Stack can be used to publish a library of synthetic parts and designs as a service, to share SBOL with collaborators, and to store designs of biological systems locally. It includes a web client that allows users to upload new biological data to the database and to perform SPARQL queries to access desired SBOL parts.

To facilitate exchange, instances of the SBOL Stack can be installed by researchers at various organizations. Users can then register these different instances of the SBOL Stack with their own instance and perform federated queries over all registered databases. These queries allow users to retrieve and compile more complete data from multiple databases without the need to manually query each repository individually. In fact, the SBOL Stack can register any Sesame RDF database, so other repositories that contain information about biological parts can be included in the federated queries. It is this automatic retrieval and integration that makes the SBOL Stack a must-have tool for researchers working on the design of systems in synthetic biology.
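
One standard way to express a query over several RDF endpoints is the SERVICE keyword of SPARQL 1.1 Federated Query. The sketch below only builds such a query string; it is not the SBOL Stack's own federation mechanism, which the abstract does not describe. The endpoint URLs and the generic triple pattern are hypothetical placeholders.

```python
def federated_query(endpoints, pattern="?part a ?type", limit=10):
    """Build a SPARQL 1.1 query that evaluates one pattern at every endpoint."""
    blocks = [f"  {{ SERVICE <{url}> {{ {pattern} }} }}" for url in endpoints]
    body = "\n  UNION\n".join(blocks)
    return f"SELECT * WHERE {{\n{body}\n}}\nLIMIT {limit}"


# Hypothetical registered instances.
registered = [
    "http://example.org/stack-a/sparql",
    "http://example.org/stack-b/sparql",
]
print(federated_query(registered))
```

Each SERVICE block is evaluated remotely at its endpoint, and the UNION combines the results, so a single query replaces one manual query per repository.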

An introduction to the DiseaseMap project: step one

Alexander Mazein

Understanding the common and diverse molecular themes implicated in causing respiratory diseases requires powerful solutions for multi-scale data integration and systematization of available knowledge. The recently published Alzheimer’s disease map (Mizuno S. et al. BMC Syst Biol. 2012;6:52) and Parkinson’s disease map (Fujita, K.A. et al. Mol Neurobiol. 2013) point towards newly available opportunities of using computationally traceable maps for translational research. Here we present a respiratory disease map focused on severe asthma that captures relevant metabolic, signaling and gene regulatory events in the Process Description language of the SBGN (Systems Biology Graphical Notation) standard (Le Novere N. et al. Nat Biotechnol. 2009; 27(9):864). This project is developed in connection with the U-BIOPRED (Unbiased BIOmarkers in PREDiction of respiratory disease outcomes, www.imi.europa.eu/content/u-biopred) consortium effort. The map is designed to provide a biological context for multi-'omics data analysis, create an environment for efficient knowledge sharing and enable mathematical modeling and simulation.

AsthmaMap in the SBGN standard: current challenges and possible solutions

Alexander Mazein

Here we would like to briefly introduce a disease map for severe asthma, present the tools we employ for developing diagrams in the SBGN Process Description and Activity Flow languages, and discuss the current challenges of representing a complex system that includes multiple cell types.

Synthetic Biology Open Language Visual: graphical notation for forward engineering of biology

Jackie Quinn

Standardized graphical notation for communicating system design is a tool that is taken for granted in many engineering disciplines. Genetic engineering and synthetic biology, however, lack a standard visual design language and follow only loose conventions for the depiction of genetic designs. The Synthetic Biology Open Language Visual (SBOLv) project is an effort toward developing a community-driven open standard for graphical representation of genetic designs. The goal of SBOLv is to support the development of effective and safe biological designs by helping biological designers and engineers communicate their ideas in a clear, concise, and consistent manner. Inspired by and related to other graphical notations in biological research, such as Systems Biology Graphical Notation (SBGN), SBOLv aims to serve the relatively new discipline of forward biological engineering, while maintaining a connection to the traditions of biological research. SBOLv version 1.0.0 is an initial set of symbols for describing the genetic parts that compose genetic designs. We show tools and methods that support the synthetic biology community’s involvement in growing and maturing SBOLv into a more comprehensive resource for graphical representation of genetic designs, and we hope that through active input from the synthetic biology and associated communities, the standard will become more comprehensive in the coverage of genetic parts and the support of regulatory interaction within biological systems.

Critical assessment of genome-scale metabolic reconstructions

Karthik Raman

A large number of genome-scale reconstructions of metabolic networks of several organisms have been published over the last decade. The reconstructions are used extensively in drug target identification, metabolic engineering, etc. These models are widely exchanged via SBML or XLS; however, the SBML usually involves an ad hoc format, combining ‘notes’ to specify metabolite information, gene associations and named parameters that specify flux constraints, as popularised by the COBRA toolbox. We here critically examine over 100 models and discuss various issues with respect to model quality. A major problem is the lack of standard identifiers for describing metabolites in the models, which makes it difficult to compare or connect different models. Some models also lack clear specifications of the constraints used in the model, which greatly reduces the ability to reproduce the results reported in the manuscript. In addition, they fail to accommodate the changes in the constraints when used in a condition different from that specified, such as an alternate medium. Some of the models also lack attributes such as Gene Protein Reaction associations and enzyme information, which are critical for model simulation. These analyses emphasise the need for a widely acceptable standard for representing constraint-based metabolic models, with stringent requirements for identifiers for metabolites, genes, etc., and perhaps the ability to incorporate alternate model configurations such as different growth media, and the corresponding constraints. These would streamline the standards of the growing number of metabolic network reconstructions.

Generating Systems Biology Markup Language Models from the Synthetic Biology Open Language

Nicholas Roehner

Synthetic biology is a highly interdisciplinary field with researchers whose backgrounds span a range of scientific and engineering disciplines, from molecular biology and genetic engineering to computer science and electrical engineering. In order to facilitate interdisciplinary collaboration, genetic design automation (GDA) software can help automate the process of creating quantitative models that conform to qualitative descriptions of genetic function. Towards this end, we have developed a computational methodology for generating quantitative Systems Biology Markup Language (SBML) models from Synthetic Biology Open Language (SBOL) modules that document the qualitative molecular interactions of genetic circuits. As implemented in our GDA software, iBioSim, this methodology captures one of many possible mappings from SBOL to SBML or another modeling standard. In the future, other mappings will be developed to enable generation of the most appropriate model for a given design task.

A systems identification-based data pipeline to create digital cell lines from cell culture data

Edwin F. Juárez Rosales

A major challenge facing computational biology today is the (re)use of measurements and mathematical models from different laboratories, due in part to a lack of standardized representations for multicellular experimental and simulation data. To help overcome this challenge, the MultiCellDS project introduced and is developing the concept of a digital cell line: a standard representation of cell phenotype with microenvironment conditions. A key step in creating a digital cell line is extracting cell cycle parameters from experimental data. To perform this task, we have developed a data pipeline that incorporates cell flow cytometry, population counts, and viability data into a mathematical model via systems identification techniques. The experimental data are used to calibrate a cell cycle model by setting the measured population counts as the target of an optimization code. The optimal solution of the optimization problem is a vector of mean times spent in each part of the cell cycle (G0/G1, S, G2, M) by each cell, the duration of apoptosis, and the apoptosis rate of G0/G1 cells. We have tested this pipeline on MCF7 cells under normoxic conditions. This process can be repeated under a variety of microenvironmental conditions to create a digital cell line, and then extended to many different cell lines to build a digital cell line database with consistent, biologically- and mathematically-relevant cell data elements. Just as standardized biological cell lines can help biologists to design experiments whose results can be replicated, reproduced, and compared, these digital cell lines can provide common ground for mathematical modelers to compare, combine, and expand on each other’s simulation results.

Visual analysis of biological networks using VANTED and SBGN-ED

Falk Schreiber, Tobias Czauderna, Michael Wybrow, Tim Dwyer, and Kim Marriott

To support SBGN, methods and tools for editing, validating, exploring, and translating SBGN maps are necessary. We present methods and algorithms to support working with SBGN maps in systems biology. They are implemented in VANTED (www.vanted.org), an integrative framework for systems biology applications which aims at the integration, analysis and visual exploration of experimental data in the context of biological networks as well as the modelling, simulation and analysis of molecular biological processes. One VANTED extension is SBGN-ED (www.sbgn-ed.org), which allows creating, editing, and exploring all types of SBGN maps. Furthermore, the syntactic and semantic correctness of created or edited maps can be validated. Existing non-SBGN maps from the KEGG database can be translated into SBGN PD maps, including automatic layout. A visualisation of SBML models in SBGN PD is also provided. Additionally, the tool allows exporting SBGN maps into several file and image formats, including the SBGN-ML format (LibSBGN, www.sbgn.org/LibSBGN).

Just-In-Time compilation and simulation of SBML

Endre T Somogyi, Herbert Sauro

We present libRoadrunner, an open-source and cross-platform library for the JIT compilation, simulation and analysis of models expressed in SBML. As simulations of cellular systems become more complex, particularly in multicellular models, the need for reusable and high performance simulation engines is becoming clear [REF to CompuCell3D application]. The libRoadrunner library has been designed to be extensible and offers superior performance to standard desktop simulators. In this talk we describe the architectural design of the library, the challenges involved in JIT compilation of declarative languages, and an overview of the interactive Python API.

To our knowledge, libRoadrunner is the first SBML JIT compilation engine. JIT compilers are rare for declarative languages. Declarative languages present unique challenges for dynamic compilation, and many constructs from traditional compiler design, such as the symbol table, are expanded to deal with the challenging scoping issues present in SBML. We will give an overview of SBML semantic analysis, intermediate code generation, and finally machine code generation.

The JIT compiled model is combined with a variety of pluggable integrators and an extensive API designed for both interactive and high performance use to form the libRoadrunner library.
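The idea of compiling a declarative model at run time can be illustrated with a toy sketch. It is not libRoadrunner's implementation or API: the abstract describes machine code generation, whereas this example only generates and compiles Python source, and the model layout, function names and scoping rule (reaction-local parameters shadow species and global parameters) are assumptions made for illustration.

```python
import re


def resolve(expr, species_index, global_params, local_params):
    """Replace each symbol in a rate expression using a small symbol table.

    Lookup order (an assumption for this sketch): reaction-local parameters,
    then species, then global parameters.
    """
    def repl(match):
        name = match.group(0)
        if name in local_params:
            return repr(float(local_params[name]))
        if name in species_index:
            return f"y[{species_index[name]}]"
        if name in global_params:
            return repr(float(global_params[name]))
        raise NameError(f"undefined symbol {name!r}")
    return re.sub(r"\b[A-Za-z_]\w*\b", repl, expr)


def compile_model(species, global_params, reactions):
    """Generate Python source for dy/dt and compile it into a callable."""
    species_index = {name: i for i, name in enumerate(species)}
    terms = {name: [] for name in species}
    for rxn in reactions:
        rate = resolve(rxn["rate"], species_index, global_params,
                       rxn.get("local_params", {}))
        for name, stoich in rxn["stoichiometry"].items():
            terms[name].append(f"({stoich}) * ({rate})")
    body = ", ".join(" + ".join(terms[name]) or "0.0" for name in species)
    source = f"def rhs(y):\n    return [{body}]\n"
    namespace = {}
    exec(compile(source, "<generated model>", "exec"), namespace)
    return namespace["rhs"], source


# A -> B with rate k * A; the local k = 0.5 shadows the global k = 2.0.
reactions = [{"rate": "k * A",
              "stoichiometry": {"A": -1, "B": 1},
              "local_params": {"k": 0.5}}]
rhs, source = compile_model(["A", "B"], {"k": 2.0}, reactions)
print(source)
print(rhs([1.0, 0.0]))
```

Because symbols are resolved once at compile time, the generated function contains only array accesses and constants, which is the property that makes compiled models faster to evaluate repeatedly than interpreted ones.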

High-throughput whole-cell spatial modeling

Devin Sullivan

Modeling and simulation of cellular dynamics is a major goal of systems biology. High spatial heterogeneity and low copy numbers of certain molecules both contribute to the need for spatially resolved approaches to modeling these systems. A major limitation of these spatially resolved approaches has been the dearth of realistic geometries of cells and their subcellular organization. Traditionally, these geometries have been either manually segmented from images or manually fabricated, both of which are tedious. In this work we use image-derived generative models trained from fluorescent imaging data using CellOrganizer to create an automated pipeline for the simulation of biochemical networks within spatially realistic cellular instances. This high-throughput pipeline allows us to study both reaction network dynamics in a spatially resolved way and the impact of cellular organization on reaction network dynamics by sampling specific regions of our model parameter space, corresponding to variation of specific geometric properties of the cells and their subcellular compartments.

PharmML - An Exchange Standard for Models in Pharmacometrics

Maciej J Swat, Sarala Wimalaratne, Niels Rode Kristensen

Objectives: The lack of a common standard for exchange of models between different software tools used in population pharmacokinetics/pharmacodynamics (e.g. BUGS, Monolix and NONMEM) has been a longstanding problem in the field. PharmML is intended to become such a standard. It is being developed by the DDMoRe consortium, a European Innovative Medicines Initiative (IMI) project.

Methods: PharmML has been developed based on requirements provided by the DDMoRe community, including numerous academic and EFPIA partners, in the form of use cases for various estimation and simulation tasks (encoded in languages such as MLXTRAN and NMTRAN) and textbooks/reports outlining the mathematical/statistical background ([1], [2]). The standard is developed as an XML schema definition, and existing standards are reused where possible (e.g. UncertML is used to encode variability/uncertainty).

Results: The current version supports Maximum Likelihood Maximization for models used in the analysis of continuous and discrete longitudinal population PK/PD data, with:

- Structural models defined as a system of ordinary differential equations (ODEs) and/or as algebraic equations.
- A flexible parameter model allowing implementation of the arbitrary parameter types used in the majority of models, with discrete or continuous covariates.
- A nested hierarchical variability model capable of expressing very complex variability structures.
- An observation model supporting untransformed or transformed continuous, categorical, count or time-to-event data.
- A trial design model, based on a CDISC standard ([3]), allowing for definition of common designs such as parallel or crossover with virtually any administration type.
- Typical modelling steps such as estimation or simulation based on inline or externalised experimental data sets.

Conclusions: The current PharmML specification already allows for the implementation of standard pharmacometric models and is a solid base for further development of PharmML over the remaining two years of the project. Subsequent releases will support delay and stochastic differential equations, optimal experimental design, etc.

References:
[1] Lavielle, M. and Inria POPIX Team (June 2013). Mixed effects models for the population approach. URL: http://popix.lixoft.net/.
[2] Keizer, R. and Karlsson, M. (2011). Stochastic models. Technical report, Uppsala Pharmacometrics Research Group.
[3] CDISC SDM-XML Technical Committee (2011). CDISC Study Design Model in XML (SDM-XML), Version 1.0. Technical report.
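To make the model components named in the Results concrete, the sketch below shows how a structural model, a parameter model with a covariate, a variability model and an observation model fit together in a population PK simulation. It does not use PharmML or its XML schema; the one-compartment bolus model, the allometric weight scaling, the proportional error model and every name and number are assumptions chosen for illustration.

```python
import math
import random


def individual_clearance(cl_pop, weight_kg, eta):
    """Parameter model: typical clearance scaled by a weight covariate,
    with a log-normal between-subject random effect eta."""
    return cl_pop * (weight_kg / 70.0) ** 0.75 * math.exp(eta)


def concentration(dose, cl, v, t):
    """Structural model: one-compartment IV bolus, written as an algebraic equation."""
    return dose / v * math.exp(-cl / v * t)


def simulate_subject(rng, dose, cl_pop, v, weight_kg, omega, sigma, times):
    """Simulate observed concentrations for one subject.

    omega: standard deviation of the random effect (variability model).
    sigma: standard deviation of the proportional residual error (observation model).
    """
    eta = rng.gauss(0.0, omega)
    cl = individual_clearance(cl_pop, weight_kg, eta)
    return [concentration(dose, cl, v, t) * (1.0 + rng.gauss(0.0, sigma))
            for t in times]


rng = random.Random(1)
times = [0.0, 1.0, 2.0, 4.0, 8.0]  # hours (made up)
# With omega = sigma = 0 the simulation reduces to the typical-subject curve.
typical = simulate_subject(rng, dose=100.0, cl_pop=5.0, v=20.0,
                           weight_kg=70.0, omega=0.0, sigma=0.0, times=times)
# A small population with between-subject and residual variability.
population = [simulate_subject(rng, 100.0, 5.0, 20.0, w, 0.3, 0.1, times)
              for w in (55.0, 70.0, 90.0)]
print(typical)
```

An exchange format has to capture each of these pieces separately so that tools with different internal conventions can reassemble the same model.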

Atomizer: Extracting Implicit Molecular Structure from Reaction Network Models

Jose-Juan Tapia

Atomizer is an expert system for extracting molecular structure information from reaction-network models, like those encoded in the Systems Biology Markup Language (SBML), in order to create a translation using the rule-based modeling paradigm. Molecular structure extraction works by analyzing reaction stoichiometry, the conventions and patterns used to name a model’s species, and author-supplied annotations. Atomized models can be visualized in a compact form through contact and process maps, which show the basic molecules, components, and interactions used to construct a model. Application of Atomizer to the curated subset of the BioModels database yielded additional structure for about 65% of species present in models containing more than 30 species. We anticipate that the library of translated rule-based models we have generated using Atomizer will be useful to the biological modeling community by providing a more accessible view of the available models and by facilitating model reuse, model alignment and merging.
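The stoichiometry-based part of this idea can be sketched with a toy example. It is not Atomizer's algorithm: it handles only binding reactions with several reactants and one product, ignores naming conventions and annotations, and the function name and species names are hypothetical.

```python
def infer_composition(reactions):
    """Infer which basic molecules make up each complex from binding reactions.

    reactions: list of (reactants, products) pairs of species-name lists.
    A reaction with two or more reactants and one product is read as
    complex formation. The passes repeat so that the result does not
    depend on the order in which reactions are listed.
    """
    composition = {}

    def parts(name):
        # A species with no known composition is treated as a basic molecule.
        return composition.get(name, (name,))

    for _ in range(len(reactions) + 1):       # bounded number of passes
        changed = False
        for reactants, products in reactions:
            if len(reactants) >= 2 and len(products) == 1:
                merged = tuple(sorted(p for r in reactants for p in parts(r)))
                if composition.get(products[0]) != merged:
                    composition[products[0]] = merged
                    changed = True
        if not changed:
            break
    return composition


# The dimerization is listed first on purpose, to exercise the repeated passes.
reactions = [
    (["EGF_EGFR", "EGF_EGFR"], ["Dimer"]),
    (["EGF", "EGFR"], ["EGF_EGFR"]),
]
composition = infer_composition(reactions)
print(composition)
```

In a rule-based translation, each inferred composition would become a pattern of bound molecules rather than an opaque species name.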

JSBML 1.0: Providing a Smorgasbord of Options to Encode Systems Biology Models

Alex Thomas

SBML is a widely accepted data format for storing, transferring, and utilizing systems biology models. The most recent specification of SBML (Level 3 Version 1) has introduced the possibility to extend the core language with specific packages in order to support more modeling aspects. The corresponding update has enabled SBML to encode flux balance analysis models, logical regulatory networks, Petri nets, rule-based models, biochemical map rendering and layout, probabilistic models, spatial processes, and hierarchical modeling frameworks. The Java™-based application programming interface JSBML provides an easy-to-use, platform-independent, open-source implementation of the latest SBML specification. It implements all currently specified SBML packages and the full SBML core, including all levels and versions. This project has already been utilized by a number of research groups and software projects. In addition to implementing the most recent SBML specification and incorporating representative test cases for released SBML Level 3 packages, the 1.0 release of JSBML will incorporate updates which improve libSBML correspondence, support array and matrix writing, and present an efficient JSBML/CellDesigner interface. These additional updates were added through three Google Summer of Code projects and reinforce JSBML’s open-source community involvement. In this presentation, we give an update on the various novelties in the upcoming 1.0 release.

Arrays Package in JSBML

Leandro Watanabe

Currently, mathematical operations in SBML are restricted to operations on scalar values, and regular structures cannot be represented efficiently. These facts motivated the development of the arrays package. As part of the Google Summer of Code program, I started implementing the package within JSBML, the Java-based library for SBML. In this presentation, I will explain my progress on the project and the issues that arose in the process.

Discussion about exporting and importing rule-based models with the SBML Multi package

Fengkai Zhang and Martin Meier-Schellersheim, Computational Biology Unit, Laboratory of Systems Biology, NIAID, NIH

The SBML Multi package provides an extension of SBML Level 3 that supports encoding rule-based models with molecular complexes that have multiple components and can exist in multiple states and in multiple compartments. The development of the SBML Multi package was presented at several recent COMBINE and HARMONY meetings and has also been actively discussed via the SBML Multi discussion list. The libSBML-multi library has been included in the libSBML experimental release since ver 5.9.0. In this meeting, we will discuss the use of SBML Multi to export and import rule-based models, with a particular focus on model implementation in software frameworks such as the Simmune modeling package.

(This work is supported by the intramural program of NIAID, NIH.)

COMBINE

COmputational Modeling in BIology NEtwork